September 12


Learning functions through Diffusion Maps

Gomez, Alvaro Almeida

arXiv.org Artificial Intelligence

We propose a data-driven method for approximating real-valued functions on smooth manifolds, building on the Diffusion Maps framework under the manifold hypothesis. Given pointwise evaluations of a function, the method constructs a smooth extension to the ambient space by exploiting diffusion geometry and its connection to the heat equation and the Laplace-Beltrami operator. To address the computational challenges of high-dimensional data, we introduce a dimensionality reduction strategy based on the low-rank structure of the distance matrix, revealed via singular value decomposition (SVD). In addition, we develop an online updating mechanism that enables efficient incorporation of new data, thereby improving scalability and reducing computational cost. Numerical experiments, including applications to sparse CT reconstruction, demonstrate that the proposed methodology outperforms classical feedforward neural networks and interpolation methods in terms of both accuracy and efficiency.
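The core ingredient described above, extending pointwise function values to new points via diffusion geometry, can be illustrated with a minimal kernel-weighted (Nystrom-style) extension. This is a rough sketch under assumed choices (Gaussian heat kernel, a hand-picked bandwidth `eps`), not the paper's actual algorithm; the function name and parameters are illustrative.

```python
import numpy as np

def diffusion_extension(X, f, x_new, eps=0.1):
    """Kernel-weighted extension of sampled values f to a new point x_new.
    Gaussian weights play the role of the heat kernel; eps is an assumed
    bandwidth, not a value from the paper."""
    d2 = np.sum((X - x_new) ** 2, axis=1)   # squared distances to samples
    w = np.exp(-d2 / eps)                   # heat-kernel-style weights
    return (w / w.sum()) @ f                # normalized weighted average

# Pointwise evaluations of f(x) = sin(x) on a 1-D "manifold"
X = np.linspace(0.0, 2.0, 21).reshape(-1, 1)
f = np.sin(X[:, 0])
est = diffusion_extension(X, f, np.array([1.0]))
```

For small `eps` the estimate concentrates on nearby samples, mimicking the smoothing behavior of the heat equation; the SVD-based dimensionality reduction and online updates in the paper address making such computations scale.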


AquaCast: Urban Water Dynamics Forecasting with Precipitation-Informed Multi-Input Transformer

Abdollahinejad, Golnoosh, Baghersalimi, Saleh, Constantinescu, Denisa-Andreea, Shevchik, Sergey, Atienza, David

arXiv.org Artificial Intelligence

This work addresses the challenge of forecasting urban water dynamics by developing a multi-input, multi-output deep learning model that incorporates both endogenous variables (e.g., water height or discharge) and exogenous factors (e.g., precipitation history and forecast reports). Unlike conventional forecasting, the proposed model, AquaCast, captures both inter-variable and temporal dependencies across all inputs, while restricting its forecasts to the endogenous variables. Exogenous inputs are fused via an embedding layer, eliminating the need to forecast them and enabling the model to attend to their short-term influences more effectively. We evaluate our approach on the LausanneCity dataset, which includes measurements from four urban drainage sensors, and demonstrate state-of-the-art performance when using only endogenous variables. Performance also improves with the inclusion of exogenous variables and forecast reports. To assess generalization and scalability, we additionally test the model on three large-scale synthesized datasets, generated from MeteoSwiss records, the Lorenz Attractors model, and the Random Fields model, each representing a different level of temporal complexity across 100 nodes. The results confirm that our model consistently outperforms existing baselines and maintains robust and accurate forecasts across both real and synthetic datasets.
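The fusion idea above, embedding exogenous inputs alongside the endogenous history rather than forecasting them, can be sketched in a few lines. This is a toy stand-in, not AquaCast's architecture: the dimensions, random projections, and additive fusion are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_inputs(endo, exo, W_endo, W_exo):
    """Project endogenous and exogenous histories into a shared model
    dimension and add them, mimicking embedding-based fusion before
    a transformer encoder (additive fusion is an assumed choice)."""
    return endo @ W_endo + exo @ W_exo

T, d_endo, d_exo, d_model = 24, 4, 6, 16      # illustrative sizes
endo = rng.normal(size=(T, d_endo))           # e.g. 4 drainage sensors
exo = rng.normal(size=(T, d_exo))             # e.g. precipitation features
W_endo = rng.normal(size=(d_endo, d_model))
W_exo = rng.normal(size=(d_exo, d_model))
tokens = fuse_inputs(endo, exo, W_endo, W_exo)  # (T, d_model) sequence
```

The fused token sequence would then feed a transformer, whose output heads predict only the endogenous channels.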


Evaluation of real-time transcriptions using end-to-end ASR models

Arriaga, Carlos, Pozo, Alejandro, Conde, Javier, Alonso, Alvaro

arXiv.org Artificial Intelligence

Automatic Speech Recognition (ASR) or Speech-to-text (STT) has greatly evolved in the last few years. Traditional architectures based on pipelines have been replaced by joint end-to-end (E2E) architectures that simplify and streamline the model training process. In addition, new AI training methods, such as weakly supervised learning, have reduced the need for high-quality audio datasets for model training. However, despite all these advancements, little to no research has been done on real-time transcription. In real-time scenarios, the audio is not pre-recorded, and the input audio must be fragmented to be processed by the ASR systems. To achieve real-time requirements, these fragments must be as short as possible to reduce latency. However, audio cannot be split at any point, as dividing an utterance into two separate fragments will generate an incorrect transcription. Also, shorter fragments provide less context for the ASR model. For this reason, it is necessary to design and test different splitting algorithms to optimize the quality and delay of the resulting transcription. In this paper, three audio splitting algorithms are evaluated with different ASR models to determine their impact on both the quality of the transcription and the end-to-end delay. The algorithms are fragmentation at fixed intervals, voice activity detection (VAD), and fragmentation with feedback. The results are compared to the performance of the same model without audio fragmentation, to determine the effects of this division. The results show that VAD fragmentation provides the best quality with the highest delay, whereas fragmentation at fixed intervals provides the lowest quality and the lowest delay. The newly proposed feedback algorithm trades a 2-4% increase in WER for a 1.5-2 s reduction in delay relative to VAD splitting.
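The two baseline splitting strategies compared above can be sketched as follows. Both functions are simplified illustrations: real VAD systems use trained models rather than a raw energy threshold, and the interval, frame length, and threshold values here are assumptions, not the paper's settings.

```python
import numpy as np

def split_fixed(audio, sr, interval_s=2.0):
    """Fragmentation at fixed intervals: cut regardless of content,
    so utterances may be divided mid-word (lowest delay, lowest quality)."""
    step = int(sr * interval_s)
    return [audio[i:i + step] for i in range(0, len(audio), step)]

def split_vad(audio, sr, frame_s=0.02, thresh=0.01):
    """Toy energy-based VAD splitting: cut only where speech falls
    silent, avoiding mid-utterance cuts at the cost of longer fragments."""
    frame = int(sr * frame_s)
    fragments, start, in_speech = [], 0, False
    for i in range(0, len(audio) - frame + 1, frame):
        silent = np.mean(audio[i:i + frame] ** 2) < thresh
        if silent and in_speech:        # speech just ended: safe cut point
            fragments.append(audio[start:i])
            start = i
        in_speech = not silent
    fragments.append(audio[start:])
    return fragments

sr = 16000
# 1 s of "speech", 1 s of silence, 1 s of "speech"
audio = np.concatenate([0.5 * np.ones(sr), np.zeros(sr), 0.5 * np.ones(sr)])
fixed = split_fixed(audio, sr)   # cuts every 2 s, ignoring content
vad = split_vad(audio, sr)       # cuts at the silence boundary
```

The feedback algorithm proposed in the paper would sit between these extremes, adjusting cut points based on the transcription returned for previous fragments.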


Uncertainty Quantification in Seismic Inversion Through Integrated Importance Sampling and Ensemble Methods

Qu, Luping, Araya-Polo, Mauricio, Demanet, Laurent

arXiv.org Machine Learning

Seismic inversion is essential for geophysical exploration and geological assessment, but it is inherently subject to significant uncertainty. This uncertainty stems primarily from the limited information provided by observed seismic data, which is largely a result of constraints in data collection geometry. As a result, multiple plausible velocity models can often explain the same set of seismic observations. In deep learning-based seismic inversion, uncertainty arises from various sources, including data noise, neural network design and training, and inherent data limitations. This study introduces a novel approach to uncertainty quantification in seismic inversion by integrating ensemble methods with importance sampling. Combining these techniques enhances the accuracy of uncertainty analysis while maintaining computational efficiency. The method involves initializing each model in the ensemble with different weights, introducing diversity in predictions and thereby improving the robustness and reliability of the inversion outcomes. Additionally, the use of importance sampling weights the contribution of each ensemble sample, allowing us to use a limited number of ensemble samples to obtain more accurate estimates of the posterior distribution. Our approach enables more precise quantification of uncertainty in velocity models derived from seismic data. By utilizing a limited number of ensemble samples, this method achieves an accurate and reliable assessment of uncertainty, ultimately providing greater confidence in seismic inversion results.
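The weighting step described above, using importance weights to correct a limited set of ensemble samples toward the posterior, can be sketched with self-normalized importance sampling on a scalar toy problem. The Gaussian proposal, likelihood, and sample count below are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def importance_weighted_posterior(samples, log_likelihood):
    """Self-normalized importance sampling: weight each ensemble
    sample by its data misfit and estimate posterior mean/variance."""
    logw = log_likelihood(samples)
    logw -= logw.max()               # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                     # normalized importance weights
    mean = np.sum(w * samples)
    var = np.sum(w * (samples - mean) ** 2)
    return mean, var

# Stand-in ensemble: 500 scalar "velocity" estimates from differently
# initialized models, here drawn from a broad N(2, 1) proposal.
samples = rng.normal(2.0, 1.0, size=500)
# Assumed Gaussian data misfit centered at the true value 2.5.
loglik = lambda m: -0.5 * (m - 2.5) ** 2 / 0.25
mean, var = importance_weighted_posterior(samples, loglik)
```

The weighted variance is the uncertainty estimate: samples that fit the data poorly contribute little, so a small ensemble can still approximate the posterior spread.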


InVAErt networks: a data-driven framework for model synthesis and identifiability analysis

Tong, Guoxiang Grayson, Long, Carlos A. Sing, Schiavazzi, Daniele E.

arXiv.org Machine Learning

In the simulation of physical systems, an increase in model complexity directly corresponds to an increase in the simulation time, posing substantial limitations to the use of such models for critical applications that depend on time-sensitive decisions. Therefore, fast emulators learned by data-driven architectures and integrated in algorithms for the solution of forward and inverse problems are becoming increasingly successful. On one hand, several contributions in the literature have proposed architectures for physics-based emulators designed to limit the number of model evaluations during training. These include, for example, physics-informed neural networks (PINN) [1], deep operator networks (DeepONet) [2], and transformer-based architectures [3]. On the other hand, generative approaches have been the subject of significant recent research due to their flexibility to quantify uncertainty in the predicted outputs. Unlike traditional deep learning tasks, generative models focus on capturing a distributional characterization of the latent variables, providing an improved understanding, and a superior way to interact with a given system. Examples in this context include Gaussian Processes [4], Bayesian networks [5], generative adversarial networks (GAN) [6], diffusion models [7], optimal transport [8], normalizing flows [9, 10] and Variational Auto-Encoders (VAE) [11]. When using data-driven emulators in the context of inverse problems, other difficulties arise. Inverse problems are often ill-posed as a result of non-uniqueness of solutions, or of ill-conditioning due to high dimensionality, data sparsity, noise corruption, and nonlinear response of the physical systems [12, 13, 14, 15, 16].
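The non-uniqueness mentioned in the last sentence is easy to see on a toy forward model: when two distinct parameters produce identical observations, no amount of data fitting can recover a single inverse solution. The quadratic model below is a deliberately trivial illustration, not an example from the paper.

```python
# Ill-posedness from non-uniqueness: the forward model y = x**2 maps
# two distinct parameters to the same observation, so inverting the
# observation y = 9 cannot distinguish x = -3 from x = 3 without
# extra structure (priors, regularization, or a latent encoding).
forward = lambda x: x ** 2
x1, x2 = -3.0, 3.0
same_observation = forward(x1) == forward(x2)
```

Frameworks like the one proposed here aim to characterize this whole set of compatible parameters rather than collapse it to a single point estimate.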